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ABSTRACT 


We searched for an isotropic stochastic gravitational wave background in the 
second data release of the International Pulsar Timing Array, a global collabora- 
tion synthesizing decadal-length pulsar-timing campaigns in North America, Europe, 
ao Australia. In our reference search for a power law strain spectrum of the form 

he = A(f /1yr-!)*, we found strong evidence for a spectrally-similar low-frequency 
stochastic process of amplitude A = 3.8*92 x 10715 and spectral index a = —0.5+0.5, 
where the uncertainties represent 9596 credible regions, using information from the 
auto- and cross-correlation terms between the pulsars in the array. For a spectral in- 
dex of a = —2/3, as expected from a population of inspiralling supermassive black hole 
binaries, the recovered amplitude is A = 2.81}: x 10715. Nonetheless, no significant 
evidence of the Hellings-Downs correlations that would indicate a gravitational- wave 
origin was found. We also analyzed the constituent data from the individual pulsar 
timing arrays in a consistent way, and clearly demonstrate that the combined inter- 
national data set is more sensitive. Furthermore, we demonstrate that this combined 
data set produces comparable constraints to recent single-array data sets which have 
more data than the constituent parts of the combination. Future international data 
releases will deliver increased sensitivity to gravitational wave radiation, and signifi- 
cantly increase the detection probability. 


Key words: gravitational waves — pulsars:general — supermassive back holes — 
methods: data analysis — methods: statistical techniques 
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2 IPTA 


1 INTRODUCTION 


Inspiralling supermassive black hole binaries (SMBHBs) 
with masses larger than 10’ Mo are expected to generate 
the strongest gravitational-wave (GW) signals in the Uni- 
verse. The incoherent superposition of all of these inspi- 
ralling SMBHBs should generate a stochastic GW back- 
ground (GWB) that is the strongest in the nanohertz fre- 
quency band (e.g., Rajagopal & Romani 1995; Jaffe & 
Backer 2003; Sesana et al. 2008; Burke-Spolaor et al. 2019). 
Other sources that could also produce a stochastic back- 
ground in the nanohertz band are cosmic strings (e.g., Ólmez 
et al. 2010), cosmological phase transitions, and a primordial 
background produced by quantum fluctuations in the grav- 
itational field in the early universe (e.g., Grishchuk 2005; 
Lasky et al. 2016). For comparison, the Laser Interferome- 
ter Gravitational-Wave Observatory (LIGO) and the Virgo 
Collaboration, which are terrestrial GW detectors and have 
detected GWs from merging stellar mass black holes and 
neutron stars (e.g., Abbott et al. 2019, 2021), are only sen- 
sitive to GW signals that are ten orders of magnitude higher 
in frequency than PTAs. 


A nanohertz GWB can be detected using a precisely 
timed ensemble of millisecond pulsars (Sazhin 1978; De- 
tweiler 1979), called a pulsar timing array (PTA, Foster & 
Backer 1990). The GWs distort the space-time between the 
Earth and pulsars, changing their proper distance, thereby 
leading to a measurable deviation of the pulsar pulse ar- 
rival times. Since such effects cannot be detected with con- 
fidence using only one pulsar, PTAs leverage the imprint of 
spatially-correlated timing deviations between pulsars which 
are separated by kiloparsec distances across the galaxy, yet 
are subject to the common influence of the GWB. 


An isotropic GWB manifests itself as a long timescale, 
low-frequency (or red) common signal across the pulsars in 
a PTA. This common signal is characterized by the com- 
mon spectrum and the inter-pulsar spatial correlations. For 
an isotropic GWB these spatial correlations are unique, re- 
ferred to as the Hellings & Downs (1983) (HD) correlations, 
and thus are considered to be the “smoking gun” signature 
for the presence of a GWB (Tiburzi et al. 2016) in any PTA 
data set. The spectral amplitude of this common signal is 
determined by the characteristic strain, he( f), of the GWB, 
which itself is a function of the physics sourcing the GWB 
(e.g., SMBHB masses, merger timescale, and number den- 
sity) (e.g., Sesana 2013a; Kelley et al. 2017; Chen et al. 
2019). Thus, precise spectral characterization of the GWB 
will allow us to extract the underlying astrophysics of the 
background, as well as distinguish between different sources 
of the GWB (e.g., Pol et al. 2021). 


The ability to detect GWs relies on, among other things, 
the number of pulsars available to cross-correlate in GWB 
searches, and on the length of each pulsar data set (Siemens 
et al. 2013). Improvements in both of these parameters in- 
creases the detection significance of the GWB signal which 
in turn allows for better constraints on the parameters of the 
GWB spectrum (Pol et al. 2021). Hence, international efforts 
spanning decades from the European Pulsar Timing Array 
(EPTA, Desvignes et al. 2016), North American Nanohertz 
Observatory for Gravitational Waves (NANOGrav, Arzou- 
manian et al. 2016), and the Parkes Pulsar Timing array 
(PPTA, Manchester et al. 2013), as well as newer PTAs 


such as the Indian Pulsar Timing Array (InPTA, Joshi et al. 
2018), Chinese Pulsar Timing Array (CPTA, Lee 2016), and 
with the MeerK AT Interferometer in South Africa (Bailes 
et al. 2020) share and combine their data to form the Inter- 
national PTA (IPTA, Hobbs et al. 2010). 

In this spirit of international collaboration the IPTA 
has produced two data sets to date. The first IPTA data re- 
lease (DR1, Verbiest et al. 2016) consisted of 44 millisecond 
pulsars and yielded no conclusive detection of a GWB. The 
second IPTA data release (DR2, Perera et al. 2019) consists 
of 65 pulsars and is the focus of this analysis. The pulsars 
in DR2 have data sets spanning 0.5 — 30 years. For the first 
time, we process the data subsets from each individual PTA 
and search for a GWB in a self-consistent way, thus enabling 
us to make a fair comparison of respective PTA constraints. 

Recently, a spatially uncorrelated (pulsar-weighted- 
average) spectrally similar common process or common- 
spectrum process (CP) was detected in the NANOGrav 12.5- 
year data set (Arzoumanian et al. 2020), the second data re- 
lease of the Parkes Pulsar Timing Array (Goncharov et al. 
2021a), and the EPTA six-pulsar data set of the second data 
release (Chen et al. 2021b). The process is modeled as an 
additional time-correlated term with the same power spec- 
trum in all of the pulsars. However, there is little evidence 
to support the existence of spatial HD correlations in any of 
these data sets. We compare the IPTA DR2 constraints on 
the GWB with those obtained from these analyses. 

The paper is organized as follows: in Section 2 we give 
an overview of the second IPTA data release, hereafter re- 
ferred to as DR2. We describe our data analysis methods in 
Section 3, and give our results in Section 4. Caveats and im- 
plications of our analysis and results are discussed in Section 
5, including the astrophysical interpretation of a potential 
GWB. The conclusion is given in Section 6. 


2 IPTA DATA RELEASE 2 


IPTA DR2 includes a combination of timing data from the 
following individual PTA data releases: the EPTA data re- 
lease 1.0 (Desvignes et al. 2016), the NANOGrav 9-year 
data set (Arzoumanian et al. 2015), and the PPTA first 
data release (Manchester et al. 2013) and its extended ver- 
sion (Reardon et al. 2016). The EPTA data set includes 
high-precision timing data from 42 MSPs obtained with the 
largest radio telescopes in Europe — Effelsberg telescope, 
Lovell telescope, Nangay telescope, and Westerbork Synthe- 
sis telescope — covering data from 1996 to 2015 with a time 
baseline between 7-18 yr. In addition to these data, archival 
timing data of PSR J1939--2134 since 1994 was included. 
The NANOGrav 9-year data set includes high-precision tim- 
ing observations from 37 MSPs obtained with the Robert 
C. Byrd Green Bank Telescope and the Arecibo telescope, 
spanning a time baseline between 0.6-9.2 yr, covering the 
data from 2004 to 2013. In addition, the long-term tim- 
ing data of PSR J1713+0747 from Zhu et al. (2015) and 
the data of PSRs J18574-0943 and J19394-2134 from 1984 
through 1992 (Kaspi et al. 1994) were included. The PPTA 
data set includes high-precision timing observations from 20 
MSPs obtained with the Parkes radio telescope (also known 
as Murriyang) from 2004 to 2011. IPTA DR2 also included 
single frequency band (1.4 GHz/L-band) Parkes telescope 
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legacy data obtained since 1994. The additional 3.0 GHz 
timing data reported in Shannon et al. (2015) for PSRs 
J0437—4715, J1744—1134, J17134-0747, and J1909—3744 
were also included in the data set. In total, the timing 
data from 65 MSPs were included in IPTA DR2, which 
has 21 more source than the IPTA DR1 (Verbiest et al. 
2016). There are 27 and 7 MSPs in IPTA DR2 with a tim- 
ing baseline >10 yr and >20 yr, respectively. All pulsars 
were observed at multiple frequencies. All EPTA and PPTA 
observations were averaged in time and frequency to ob- 
tain a single time-of-arrival (TOA) for each receiver and 
observation. The NANOGrav observations were averaged in 
time and included sub-band information, i.e., averaged in 
frequency to maintain a frequency resolution ranging from 
1.5 to 12.5 MHz depending on the receiver and backend in- 
strument combination, resulted in a single TOA for each 
frequency channel. More details about the constituent PTA 
data sets can be found in Perera et al. (2019). 

The different data sets for a given pulsar in IPTA DR2 
were combined by fitting for time offsets, referred to as 
JUMPs, in the timing model to account for any system- 
atic delays between data sets. The highest weighted data set 
with the lowest sum of TOA uncertainties was used as the 
reference data set in this process. The timing models of pul- 
sars included astrometric parameters, rotational frequency 
information, dispersion measure information, and Keplerian 
and Post-Keplerian parameters if the pulsar is in a binary 
system. For NANOGrav observed pulsars, “FD” parameters 
were included to minimize the effect of frequency-dependent 
profile variations of pulsars (see Arzoumanian et al. 2015). 
IPTA DR2 produced two data set versions depending on dif- 
ferent methods of handling the dispersion measures (DM) 
variations of pulsars over time (VersionA and VersionB — 
see Perera et al. 2019, for details). In VersionA, the DM 
variations of pulsars were determined using DMMODEL de- 
scribed in Keith et al. (2013) and the noise parameters for 
different data sets were directly taken from their original 
data releases. In VersionB, the DM variations were modeled 
using the first two time derivatives of the DM and a time- 
correlated stochastic DM process in the timing model. The 
noise parameters were also re-estimated based on the new 
IPTA data combination in this version. We use VersionB for 
this work. 


3 DATA-ANALYSIS METHODS 


In this work we follow the conventions established by other 
pulsar timing array data analyses (i.e., Arzoumanian et al. 
2016, 2020; Lentati et al. 2015; Goncharov et al. 2021a). 
The multivariate Gaussian likelihood £(dt|@) is employed 
to model noise and signal contributions, parametrised by 0, 
to the observed timing residuals. Our likelihood was of the 
same form as other PTA analyses, (e.g., Arzoumanian et al. 
2015). We used enterprise (Ellis et al. 2020) to evaluate 
the likelihood and priors, and PTMCMCSampler (Ellis & van 
Haasteren 2017) to perform a Markov chain Monte Carlo 
(MCMC) simulation, drawing samples from the posterior 
probability distribution. Model selection was performed via 
the product-space sampling method (Carlin & Chib 1995; 
Hee et al. 2016). Additionally, we used the Savage-Dickey 
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approximation to the Bayesian evidence ratio when appro- 
priate. 


3.1 Noise models 


For each pulsar, we modeled the TOAs with a combina- 
tion of four processes: the timing model, white noise, in- 
trinsic red noise, and DM variations. Deterministic contri- 
butions from the timing model, described in Section 2, were 
analytically marginalised (van Haasteren et al. 2009). The 
time-uncorrelated white noise was modeled with EFAC and 
EQUAD, and ECORR parameters for NANOGrav pulsars 
with their sub-banded TOAs (definitions of EFAC, EQUAD, 
and ECORR can be found in, e.g., Verbiest et al. 2016; 
Perera et al. 2019). Every observing receiver and backend 
system combination is given its own set of white noise pa- 
rameters. The time-correlated red noise process (e.g., pul- 
sar spin noise, Shannon & Cordes 2010) and stochastic DM 
variations (Keith et al. 2013) were modeled as Fourier basis 
Gaussian processes. In each case Fourier spectrum coeffi- 
cients were modeled as power laws, 


2 —YRN 
Path= mL). (1) 
2 —YDM 2 
Poulin = Spare (E) (uu). e 


where A is the power law amplitude, y its spectral index, 
and fyr = 1 yr! ~ 3.17 x 10^? Hz. The difference between 
these two processes lies in the radio frequency v dependence. 
Intrinsic red noise is achromatic, i.e., frequency independent, 
while DM variations follow a v~? dependence (e.g., Lentati 
et al. 2016). 

Despite all MSPs in IPTA DR2 exhibiting high rota- 
tional stability, such that the marginalized timing model, 
red, and DM noise terms are in general sufficient, certain 
pulsars have been found to experience timing events that 
need to be included in their data model. Of interest to this 
analysis is PSR J1713+0747, which was observed to expe- 
rience multiple sudden drops in apparent DM with an ex- 
ponential recovery (Demorest et al. 2013; Lam et al. 2018; 
Goncharov et al. 2021b). Only the first such event lies within 
the timespan of the IPTA DR2 and was included as an ad- 
ditional deterministic term to the full noise model of PSR 
J1713+0747. The amplitude, epoch and recovery time scale 
of the DM exponential dip are sampled simultaneously with 
the pulsar red and DM noise terms. 

Lentati et al. (2016) found additional sources of red 
noise in IPTA DR1. These include radio frequency band- 
dependent and observing system-dependent terms, which 
may affect measurements of Pan and Ppm, if not modeled. It 
is possible that mismodeling these effects can bias recovery 
of the CP. More prescriptive models for the CP should be 
less affected by this bias. Recent P'TA analyses have included 
more complex red noise and DM variation models, where 
different pulsars in the array use different models (e.g., Ag- 
garwal et al. 2019; Goncharov et al. 2021b). In the name 
of computational efficiency we opted to use the same power 
law models for all pulsars except when absolutely necessary, 
as is the case for PSR J1713+0747. 

IPTA DR2, being the combination of data from multi- 
ple telescopes and many observing systems, has larger model 
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parameter space than its constituent data sets. The large 
number of model parameters and TOAs increases the com- 
putational complexity of the analysis. As we searched for 
long-term processes, such as the GWB, we limited our anal- 
ysis to pulsars whose observation time exceeded 3 years. This 
reduced the number of pulsars from the full 65 in DR2 to 53. 
Additionally, we fixed the white noise parameters (EFAC, 
EQUAD, and ECORR) to median aposteriori values from 
single pulsar analyses. Both of these choices reduced the 
analysis parameter space to a more manageable size. 


3.2 Common-spectrum process models 


In addition to modeling noise intrinsic to the individual pul- 
sars, we also include a red CP that is present in all of the 
pulsars. The source of this process could be the GWB, or 
any other common noise that manifests itself in all pulsars, 
such as clock errors (Caballero et al. 2016; Hobbs et al. 
2020) or errors in the Roemer delays from Solar-system 
ephemeris (SSE) systematics (Tiburzi et al. 2016; Vallisneri 
et al. 2020). The choice of red noise priors also affects the 
recovery of a CP due to covariance between pulsar intrin- 
sic red noise and the CP (Hazboun et al. 2020b; Goncharov 
et al. 2021a). Each of these effects can be distinguished by a 
unique pattern of spatial cross-correlations between pulsars. 
The cross-power spectral density is defined as, 


Sabl f) =TavPcr(f), (3) 


where Pcp is the common-spectrum process and Pas is the 
overlap reduction function (ORF) describing the inter-pulsar 
correlations. 

For some analyses we did not account for any inter- 
pulsar correlations, taking lab = da, to be the identity ma- 
trix. For others, we also included different choices of non- 
diagonal ORFs, such as the quadrupolar Hellings & Downs 
(1983) correlations that describe a GWB, dipolar correla- 
tions associated with SSE errors, or monopolar correlations, 
Tab = 1, associated with clock errors. In some cases we split 
the diagonal auto-correlation part of the ORF from the off- 
diagonal cross-correlation part, treating them as indepen- 
dent processes as a consistency check. When modeling the 
CP using only the auto-correlations, it is possible to analyze 
the data from each pulsar independently, then recombine the 
results to achieve a joint posterior on the CP. We refer to 
this as the factorized likelihood approach. 

We modeled the CP using a Fourier basis Gaussian pro- 
cess, using basis frequencies f = 1/T,2/T,..., where T is 
the timespan between the earliest and latest observation in 
the data set. We model the power spectrum of the CP as a 
power law using Eqn. (1), replacing the pulsar noise ampli- 
tude and spectral index with those from the common process 
Acp and ycr. In this parameterization of the power spec- 
trum the characteristic strain spectrum for the GWB is 


f (3-yep)/2 
helf) = Ace (£) (4) 
yr 
In some cases we fixed yop = 13/3, equivalent to a = 
(3 — ycp)/2 = —2/3, the expected spectrum for a GWB 


composed of circular supermassive binary black holes (Phin- 
ney 2001), and in others we left ycp as a free parameter. 
To determine the number of Fourier frequencies used in the 


power law CP model, we fit the power spectrum with a bro- 
ken power law model. The broken power law is the sum of 
the standard, red power law and a white spectrum. This is 
implemented as a single spectrum with a fixed spectral in- 
dex at low frequencies that smoothly transitions into a flat, 
white noise dominated spectrum at high frequencies: 


A 2 E —'yap l/k ETOR 
Pu) - Ke (4) f+ ( n) | , 


(5) 
where fbena is the frequency where the spectral index of 
the power spectrum changes and « controls the smooth- 
ness of the transition. In this model P(f) ~ f ?€* for 
f < foena; and P(f) constant for f >> frena. As a verifica- 
tion of our power law models, we performed a free spectral 
analysis, where the power at each frequency is fit indepen- 
dently rather than being constrained by a particular spectral 
shape, P(f). 


3.3  Frequentist analyses 


As a comparison for our primary Bayesian data analysis 
pipeline, we performed a frequentist analysis using the noise- 
marginalized optimal statistic. The optimal statistic is an 
estimator for the amplitude of the GWB based on the inter- 
pulsar correlations (Anholm et al. 2009; Demorest et al. 
2013; Chamberlin et al. 2015). Its original derivation as- 
sumed the pulsars have no intrinsic red noise. The noise- 
marginalized optimal statistic uses posterior samples from 
the Bayesian data analysis to marginalize over the pulsars’ 
red noise. It has been shown to more accurately estimate the 
amplitude of the GWB when the pulsars have intrinsic red 
noise (Vigeland et al. 2018), as is the case in IPTA DR2. 


4 RESULTS 


The IPTA DR2 data set with its large number of pulsars 
(53 for this work), long timespan, and various independent 
observing systems offers a wealth of different analysis op- 
portunities. Here we present a selection of analyses to give 
a complete picture of what IPTA DR2 teaches us and how 
it compares to other PTA data sets. 


4.1 IPTA DR2 data set 


We first show results from the full combined IPTA DR2 data 
set using methods which differ in their spectral modeling and 
choice of ORF. The full standard GWB search uses all the 
available information from the auto- as well as the cross- 
terms (which are assumed to follow the HD correlation). As 
the cross-terms, which come from the inter-pulsar correla- 
tions, are the defining feature of the GWB, we insist on their 
presence in order to confidently claim a detection. However, 
the auto-correlations are initially the dominant source of in- 
formation, especially for spectral parameter estimation, and 
the detection of power in them is considered to be the first 
hint of a GWB (Romano et al. 2021). It is important to 
emphasize that detecting the auto-correlations alone is in- 
sufficient to claim a detection of a GWB. 
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Figure 1. IPTA DR2 free spectrum and characteristic strain: 
The top panel shows the power in terms of time of arrival resid- 
uals at frequencies in the nanohertz band for the full PTA. The 
maximum likelihood power law is shown overlaid on posteriors for 
free spectral parameters, a generic model that measures power at 
various frequencies without imposing any empirical model. The 
bottom panel shows the power law and free spectral information 
from above converted into units of characteristic strain, i.e., the 
noise power measured in the same units as GW amplitude. The 
additional line shows the characteristic strain for the detector us- 
ing the noise parameters for the pulsars. The lower limit of all 
violins is a result of the lower bound of the prior range for each 
frequency component. 


4.1.1 Common-spectrum process 


To begin, we apply a free spectral model for the CP, mea- 
suring the amount of power at each sampling frequency in- 
dependently, up to 30 frequency bins, the result of which 
is shown in the top panel of Figure 1. The common red 
noise power can be converted into GW strain using Eqn. 
(4). This alternative representation of the IPTA DR2 anal- 
ysis can be found in the lower panel of Figure 1. For refer- 
ence, we include the predicted sensitivity curve, made using 
hasasia (Hazboun et al. 2019a,b) and the measured white 
noise parameters of the DR2 data set. Note that the noise 
power spectral density used in this curve only contains TOA 
errors, EFAC, EQUAD and ECORR, and does not contain 
any estimates for the red noise, as for many pulsars it is dif- 
ficult, and in fact the point of this analysis, to disentangle 
intrinsic red noise from a GWB. Hence the low-frequency 
end of the sensitivity represents a “best case" scenario for 
comparison. The lowest frequency bin corresponds to the 
longest timespan, and only two pulsars, J1939+2134 (29 yr) 
and J1857--0943 (28 yr), have observation baselines suffi- 
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Figure 2. Bend Frequency: The posterior for the bend frequency 
parameter in a broken power law search is shown. The peak of the 
posterior is at the 13th frequency, 13/T' for the data set, denoted 
by the dashed vertical line. 


cient to probe this frequency. However, both have signifi- 
cant RN with spectral indices ~ 3.3 (Perera et al. 2019). 
Therefore, it is not surprising that we do not confidently de- 
tect any power there, evidenced by the wide tail extending 
to low power and median of ~ 1077 s. However, the second, 
third, fifth, and eighth frequency bins show power well above 
the expected sensitivity of IPTA DR2. This could either be 
the emergence of a GWB or some other unmodeled noise 
process. 

The CP power spectrum can be modeled with a simple 
power law using Eqn. (1). Arzoumanian et al. (2020) have 
shown that the choice of the number of modeled frequen- 
cies can affect the constraints on the power law amplitude 
and spectral index. Thus, we apply the broken power law 
model from Eqn. (5) to find the optimal number of frequency 
bins for the analysis. Figure 2 shows the marginalized poste- 
rior on the bend frequency. We can identify a clear peak at 
the 13th frequency in N/T, corresponding to a frequency of 
1.4 x 107? Hz, indicated by the orange dashed line. For the 
remainder of this work, we will limit the search to use only 
the lowest 13 frequencies with the simple power law model. 
This produces constraints equivalent to an analysis using 
the broken power law, but in a simpler and computationally 
efficient way. 

We have also verified that the addition of the 
BAYESEPHEM SSE model (Vallisneri et al. 2020) to the analy- 
sis does not change our results significantly from an analysis 
that fixed the SSE to DE438. The nearly 30 years of times- 
pan allows for the separation of the SSE effects from other 
correlated signals (Vallisneri et al. 2020). For simplicity, we 
will only show results with DEA38, unless stated otherwise. 

Figure 3 compares the results when using two differ- 
ent ORFs. The model that uses only the auto-correlation 
terms, which we denote CP in Table 1, is very strongly fa- 
vored over a model with only intrinsic pulsar noise and no 
common-spectrum process with log; , Bayes factor of 8.2. 
Despite the large Bayes factor in favor of the CP, this does 
not suffice to claim a GWB detection, as we have only used 
the auto-correlations. This strong evidence only indicates 
that a number of pulsars have red noise with similar spec- 
tral characteristics. We must turn to the cross-correlations 
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Figure 3. Comparison of common-spectrum process parameters when using auto-correlations only and the full auto+cross-correlated HD 
model. left: 2D posterior for common-spectrum process power law parameters. Green lines mark y = 13/3 and Acp = 2.8 x 10-15, while 
the contours represent the 1—, 2— and 3-o confidence intervals. right: 1D posterior for common-spectrum process power law amplitude, 


using fixed spectral index y = 13/3. 


Table 1. Bayes factors model comparison: The table shows the 
logarithmic Bayes factors for a number of model comparisons 
from the hypermodel and factorized likelihood (marked with an 
asterisk*) analyses. The preferred model is on the left side of the 
two models. Brackets indicate the uncertainty in the last digit of 
the Bayes factors. 


Model comparison logio BF 
HD vs CP 0.3111(6) 
CP vs Pulsar Noise 8.2* 
CP vs Monopole 4.67(2) 
CP vs Dipole 2.28(3) 


to determine if this CP is HD correlated as a GWB should 
be. Using the full HD ORF containing both auto- and cross- 
correlations, we find only middling evidence in favor of the 
auto+cross HD model. The log,, Bayes factor for the full 
HD model compared to the auto-correlated only CP is 0.3, 
as shown in Table 1. 

Figure 3a shows the 2D posterior contours of these two 
models are in relatively good agreement. A small shift to- 
wards lower amplitudes and higher spectral index can be 
seen when using the full HD ORF with both auto- and cross- 
correlations. Using the auto-correlation terms only, we find 
Acp = nae x 107? and yop = 3.90.9, where the errors 
represent 9596 credible regions. Using the full HD ORF we 
can constrain the CP power law to Acp = 3.8*$3 x 10? 
and yop = 4.0 + 0.9. 

When we fix the power spectrum index to y = 13/3 as 
shown in Figure 3b, it is clear that full HD model finds a 
systematically lower amplitude. In this case we find an am- 
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Figure 4. The constraints from the split ORF analysis on the 
common process amplitude with spectral index fixed to y = 13/3. 
The posterior from the auto-correlations is well constrained, while 
the posterior from the cross-correlations is unconstrained, but 
prefers amplitudes slightly smaller than the auto-only analysis. 
The effect of this is seen in the amplitude posterior from the full 
auto+cross-correlation model, where the posterior peaks at lower 
amplitude than the auto-only analysis. 


plitude of Acp = 3.2 4 1.0 x 10715 for the auto-correlation 
only analysis and an amplitude of Acp — Pi x 10 7? 
using the full HD ORF, where the uncertainties represent 
the 9596 credible regions. These results are in broad agree- 
ment with published constraints on the CP. A more detailed 


comparison can be found in Section 4.3. 
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Figure 5. Results from the noise marginalized optimal statistic. 
Top: Optimal statistic, A?, distribution for monopole, dipole and 
HD ORFs. The relevant posteriors from the Bayesian split ORF 
analysis are also shown for comparison. Bottom: The signal-to- 
noise (S/N) distribution for monopole, dipole and HD ORFs. 


4.1.2. Split ORF analysis 


Similar to how we may consider the auto-correlation parts 
of the ORF alone, the full ORF can be split into two in- 
dependent processes. In this case the auto-correlation and 
the cross-correlation parts each have their own independent 
amplitude, as was done in Arzoumanian et al. (2020). In the 
HD ORF the auto-correlation part is laa = 1 and the cross- 
correlation parts are suppressed by at least a factor of 2, 
Tab < 0.5. This makes the cross-correlations harder to con- 
strain. Figure 4 shows the posteriors for the two amplitudes 
of a split ORF analysis for fixed y = 13/3, compared to the 
full auto+cross-correlation model. The cross-correlations do 
not have sufficient precision to place constraints on the am- 
plitude of the GWB. However, they do place a 95% upper 
limit of 3.6 x 10^ !? on the GWB when the prior choice is 
taken into account. This is consistent with the amplitude 
derived using the full auto- and cross-correlation model in 
Fig. 4. The auto-correlation terms are much more informa- 
tive. Combining the information from both shifts the ampli- 
tude towards lower values. This shows that the cross-terms 
can contribute to the full GWB search, even if they provide 
less information. The auto-correlations are more likely to be 
affected by intrinsic pulsar noise. Using a more sophisticated 
noise model for each pulsar can help produce a more robust 
estimate on the amplitude of any CP. 
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Table 2. The p-values calculated from various false alarm anal- 
yses of the data set. The measured values of the S/N are com- 
pared to the distribution of 10k analyses where the correlations 
are broken with phase shifts and sky scrambles. Since a monopo- 
lar spatial correlation is uniform across the sky, sky scrambles 
are unable to break the correlations. Hence only the phase shift 
p-value is quoted. 


p, phase shift | p, sky scramble 
HD ORF 0.25 0.24 
Monopole 0.03 = 


4.1.3 Optimal statistic 


Figure 5 shows the amplitudes and signal-to-noise ratio 
(S/N) that are recovered by the pulsar noise marginalized 
optimal statistic (OS) method, which uses cross-correlations 
only. We find no evidence for a dipolar correlated process, 
as the amplitude and S/N for this model are centered on 
0. SSE systematics are expected to manifest at specific fre- 
quencies related to the celestial bodies. The IPTA DR2 data 
set is long enough to probe lower frequencies that should be 
less affected by SSE errors (Vallisneri et al. 2020). The S/N 
= 0.6*,2 for the Hellings-Downs correlation is insufficient 
to claim a detection. This is consistent with the Bayesian 
model selection. The HD amplitude from the OS seems to be 
in tension with the Bayesian results for the auto-correlated 
CP, but consistent with the Bayesian results for the full HD 
model. This strengthens the case that the cross-terms have 
a significant role to play in parameter estimation as well as 
detection confidence. Finally, the OS has the largest S/N 
= 2.0*1* for a monopole with a small amplitude. This can 
be due to the complexity of IPTA DR2 and some amount of 
unmodeled noise. 

As the spatial correlations are not well constrained, see 
Figure 6, both the HD and monopolar correlation can fit 
the data. We have binned pulsar pairs according to their 
angular separation. Increasing the number of pulsars in the 
array as well as better timing of pulsars can help to tighten 
the constraints. 

We test the significance of the OS S/N by performing 
two analyses which estimate the false alarm rate for a given 
S/N. So called phase-shifts and sky scrambles (Cornish & 
Sampson 2016; Taylor et al. 2017) break the correlations be- 
tween the pulsars leaving the red noise power in the pulsars, 
but removing evidence for spatial correlations. By analyzing 
the phase-shifted and sky scrambled data we can determine 
the rate of observing a particular S/N for a type of correla- 
tions in data that has none. These two false alarm studies 
result in p-values, Table 2, too high to conclude that there 
is evidence for any correlations. It is possible that the mea- 
sured HD S/N can therefore arise by chance from a common 
process with no spatial correlation. 


4.2 IPTA DR2 data subsets 


With the large volume of the IPTA DR2 data set we can 
look at different subsets to investigate hints of the origin 
and evolution of the CP signal across different slicings of 
the full data set. 
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Figure 6. Cross-correlation ORF curve from the optimal statis- 
tic. The black points indicate the amount of cross-correlation for 
a given angular separation. Due to the large number of pulsar 
pairs, we have binned multiple pairs with similar angular separa- 
tion. The blue and orange dashed lines show the best-fit values 
for the HD and Monopole correlations. 


4.2.1  Pulsar-based selection 


Since a PTA is made from a number of single pulsars, we can 
look at how each pulsar contributes to the CP by itself. The 
dropout factor gives a measure on how consistent a given 
pulsar's intrinsic red noise is with the CP by comparing a 
model with and without the CP for the pulsar (see e.g., Ar- 
zoumanian et al. 2020). The dropout factors for each pulsar 
computed using both the traditional hypermodel and factor- 
ized likelihood approaches are shown in Figure 7. About 20 
pulsars have factors > 1, while only three slightly disfavor 
the CP, with the remaining pulsars displaying indifference. 

Monte Carlo sampling uncertainties on the dropout fac- 
tors (computed either way) can be estimated through sta- 
tistical bootstrapping (Efron & Tibshirani 1994). In the hy- 
permodel dropout analysis the MCMC chain is re-sampled 
with replacement, generating a new statistical realization 
of the sampled chain that is exceptionally unlikely to be 
identical to the original chain. This process was repeated 
10? times, generating many realizations of the MCMC chain 
from which the dropout factors were computed. Hence a dis- 
tribution of dropout factors over bootstrap realizations was 
generated for each pulsar, allowing us to compute median 
values and 9596 confidence intervals. A similar procedure was 
performed for the factorized likelihood approach. For a given 
bootstrap realization, each individual pulsar's MCMC chain 
was re-sampled with replacement. With the re-sampled CP 
posteriors for each pulsar, the factorized-likelihood approach 
pieces together the dropout factor by iteratively removing 
pulsars from the array, and all by using bootstrapped pulsar 
CP posterior chains. This process was repeated for 10? boot- 
strap realizations across 25 different combinations of meta- 
parameters used in the factorized-likelihood dropout factor 
calculation. The end result is that the median dropout fac- 
tor and 9596 confidence intervals were computed from a to- 
tal of 2.5 x 10* factorized-likelihood dropout values for each 
pulsar. As is seen in Figure 7, all dropout factors are consis- 
tent between both techniques. The vast majority of pulsars 
have dropout factors with overlapping error bars from both 


methods, and those that don't are within a few sigma of 
each other. Those pulsars that show the largest disparity 
are ones for which there were MCMC sampling inefficien- 
cies that manifested in different stages of the dropout factor 
calculation, e.g., in the Savage-Dickey density ratio, or in 
the integral of the (N — 1) array's CP likelihood over the 
posterior of a given pulsar (Taylor et al., in prep.). 

The modularity and speed allowed by the factorized 
likelihood method can be used to approximate different com- 
binations of pulsars within the array. These sub-arrays are a 
useful way to verify and understand the results we see in the 
full array. We created four sub-arrays, consisting of pulsars 
with the highest/lowest dropout factors, longest/shortest 
timespans. These pulsars were selected by sorting all pulsars 
in the array by their dropout and time-span characteristics, 
and then taking the top half of 27 pulsars and the bottom 
half of 26 pulsars. The Savage-Dickey density ratio was cal- 
culated for these sub-arrays to compare them to that of the 
full array. The sub-array made up of the top half of pulsars 
according to dropout factor had a Savage-Dickey density ra- 
tio of 5.6 x 10?, an order of magnitude larger than that of 
the unaltered array, 1.6 x 10°. The corresponding sub-array 
of the bottom half of pulsars based on dropout had a density 
ratio of 1.8. When comparing the pulsars with the longest 
and shortest timespans, the sub-array with the longer times- 
pans had a Savage-Dickey density ratio of 1.8 x 10’ and 
the shorter timespan sub-array had a density ratio of 3.6. 
'These results were not surprising, as the dropout factor is a 
method of measuring the evidence for the array's common 
process in a particular pulsar, so by removing those that 
have low (high) dropout factors, the evidence for the CP 
will increase (decrease). Similarly, pulsars with short times- 
pans are not sensitive to the lowest frequencies explored by 
the array when in combination with longer timespan pulsars. 


4.2.2 Splitting IPTA DR by time 


To test the evolution of the common-spectrum process, the 
DR2 is split into two data sets that have equally long times- 
pans, i.e., cutting the DR2 in two time slices (in a similar 
manner as Hazboun et al. 2020a). The two data sets are 
not fully equivalent though: the early part contains only 19 
pulsars and is mostly dominated by single-radio frequency 
observations, while the second part has data from all 53 
pulsars as well as multi-radio frequency coverage and higher 
quality timing measurements. Each data set is then analysed 
separately. We find that the first half gives little information 
to the CP, with a broad power law 2D posterior contour that 
still encompasses the contour from the full data set. The sec- 
ond half contains the majority of information and produces 
almost identical constraints as the full data set. This is the 
expected evolution as the quality of the data set gradually 
improves over time (Hazboun et al. 2020a). 


4.2.9 Constituent data sets 


We can also select the data that were provided by the 
constituent PTA collaborations to get three data subsets: 
EPTA, NANOGrav and PPTA. As each data subset has a 
different timespan, we set a frequency cutoff at 1.4 x 10 5Hz 
to limit the number of frequencies for the analyses. Figure 8 
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Figure 7. Individual pulsar consistency with common-spectrum process, error bars represent 95% credible intervals. Pulsars with dropout 
factors > 1 contribute to the detection of the CP. Dropout factors of ~ 1 correspond to no evidence for or against the CP, usually due 
to higher white noise levels and/or shorter observation timespans. Pulsars with dropout factors < 1 are in tension with the CP. 
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Figure 8. Comparison of IPTA DR2 to constituent data sets. left: Free spectral common-spectrum process model. The different sampling 
frequencies are a result of the constituent subsets covering different timespans. In all cases the lowest frequency is the inverse observation 


time. right: 2D posterior for CP parameters log-amplitude and spectral index, where the contours represent 1-, 2-, and 3-0 confidence 
intervals. The combined IPTA DR2 data set is more constraining that its parts. 


shows that IPTA DR2 produces the tightest constraints on 
the CP power law compared to the constituent data sets. 
While the PPTA data is still consistent with a upper limit, 
some support for a common red noise can be found with the 
EPTA and NANOGrav data. The free spectra also show 
consistency with a power law model that spans across all 
three constituent data sets and IPTA DR2. 


4.3 Comparison with other recent data sets 


Since the data from the regional PTAs were combined to 
form the IPTA DR2, the regional PTAs have continued to 
collect data and improve their data analysis methodology. 
We can compare the results using the older IPTA DR2 data 
set and the most recent data sets from NANOGrav, PPTA 
and EPTA. Compared to the constituent PTA data sets, the 
recent NANOGrav data set includes ~ 4 more years and 10 
new pulsars, the PPTA expands by ~ 3 years and 7 pulsars, 
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Table 3. Mahalanobis distance between CP parameters (log- 
amplitude and spectral index) for each pair of PTAs. For all cases, 
there is less than 3-sigma separation. 


EPTA PPTA | NANOGrav 
IPTA 0.6 2.6 2.6 
EPTA = 2.3 2.4 
PPTA = = 1.4 


the EPTA DR2 adds ~ 7 years for 6 pulsars (Alam et al. 
2021; Kerr et al. 2020; Chen et al. 2021b). 

The published free spectral and power law model re- 
coveries can be found in Figure 9. For simplicity, we also 
show the recovered amplitudes at the reference frequency 
of 1/(lyr) and fixed yop = 13/3 in Figure 10. The Ma- 
halanobis distance Dm acts as a generalization to compute 
the n-dimensional sigma deviation between two distributions 
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Figure 9. Comparison of IPTA DR2 to other recent data sets. left: Free spectral common-spectrum process model. The inclusion of 
legacy data not used in recent PTA analyses allows IPTA DR2 to reach lower frequencies despite missing the most recently collected data. 
right: 2D posterior for CP parameters log-amplitude and spectral index, where the contours represent the 1—, 2-, and 3-c confidence 
intervals. All recent data sets are in broad agreement on the characteristics of a common-spectrum process. 
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Figure 10. CP amplitude posteriors for fixed spectral index, 
y = 13/3. IPTA DR2 and EPTA DRQ2 find a systematically higher 
amplitude for the common-spectrum process than NANOGrav 
12.5 yr and PPTA DR2, although the disagreement is not sub- 
stantial. 


(Mahalanobis 1936), 


Dy = y (ii — i2)5- (gi — i), (6) 


where ji and gu are the mean vectors of the multivari- 
ate distributions to be compared and X = X + X» is the 
joint covariance. To quantify the overlap and consistency of 
the power law parameters as determined using each dataset, 
the Mahalanobis distance between the 2D posterior distri- 
butions are computed in Table 3. Despite some differences 
the posteriors overlap better than 3-sigma for all pairs of 
distributions. 

IPTA DR2, using older observations, still shows simi- 
lar features as the NANOGrav 12.5, 6-pulsar EPTA DR2 
and PPTA DR2 analyses, which have added a significant 
amount of new data to the regional PTA data sets. A future 
combination of these data sets will boost the total PTA sen- 


sitivity in the same way IPTA DR2 is more sensitive than its 
constituent data sets. Future combined IPTA data sets will 
be important for investigating the origin of this common- 
spectrum process. 


5 DISCUSSION & OUTLOOK 
5.1 Source of the common-spectrum process 


The first IPTA data release did not show signs of a common- 
spectrum temporally-correlated process, but set an upper 
limit of 1.7 x 1071" instead. This appears to be in tension 
with our results from analysis of the second data release 
with a CP amplitude of 2.8 x 10715. However, there are 
two major differences to point out: 1) the different choice of 
priors for the pulsar red, DM and common noise (Hazboun 
et al. 2020b) and 2) the DR1 upper limit was computed 
without the use of a SSE uncertainty model (Vallisneri et al. 
2020). Both of which have been shown to lead to an increase 
in the upper limit, alleviating tensions between the DR1 and 
DR2 CP amplitudes. 

As in other recent PTA analyses, we find strong evi- 
dence in favor of the CP over the noise only hypothesis. It 
is important to note that 1) the lack of support for GW- 
like spatial correlations prohibits any claims of GW detec- 
tion, however 2) this type of evidence for a similar red noise 
is expected to precede a detection of spatial correlations 
(Siemens et al. 2013; Pol et al. 2021; Romano et al. 2021). 

Goncharov et al. (2021a) recently demonstrated that 
the common-spectrum process model is favored over the 
noise-only hypothesis when the noise spectra cluster in a 
similar range, and it is not favored anymore when the noise 
spectra are drawn from the prior distribution. Because we 
know that the employed prior distribution for red noise pa- 
rameters is not representative, it is possible that the evi- 
dence we find for a common-spectrum process is caused by 
a rejection of a null hypothesis rather than by all pulsars 
exhibiting the spatially-uncorrelated component of a GWB. 
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Thus, it is important to examine the single pulsar red 
noise in detail. We have looked at constraints on the simple 
power law models for the pulsars used in the CP search. In 
general, pulsars with detectable intrinsic noise have compa- 
rable or larger noise than the CP; pulsars without red noise 
typically have large amount of white noise, such that the CP 
is ‘hidden‘. One noticeable exception is PSR J2317+1439, 
whose noise spectrum falls clearly below the CP, see also its 
low dropout factor in Figure 7. 

As the search for the common spectrum can be influ- 
enced by pulsar intrinsic noise, especially in an inhomoge- 
neous data set, the crucial analysis has to consider informa- 
tion from the cross-correlations. It should be noted that the 
median amplitudes are slightly different in the analyses with 
and without spatial correlations, 2.8 x 107? vs 3.2 x 10715. 
One can also note the stark difference in the posterior for 
the split ORF analysis, Figure 4 and the optimal statistic 
analyses vs the Bayesian uncorrelated analysis, Figure 5. In 
other analyses (Arzoumanian et al. 2020; Goncharov et al. 
2021a; Chen et al. 2021b) the amplitudes between the two 
analyses are more in line with one another. The difference 
here could be in part due to the very long baselines of only a 
handful of pulsars. This legacy data allows only scant oppor- 
tunity for correlations amongst those few pulsars, while the 
long baselines allow the detection of auto-correlated power, 
even in noisier data. Another possibility is that there is unac- 
counted for noise in individual pulsars that is contaminating 
the signal. Advanced models to take these pulsar noises into 
account for the GWB search have been shown to affect the 
individual pulsar red noise (e.g., Arzoumanian et al. 2020; 
Goncharov et al. 2021a, Chalumeau et al., in prep.). 


5.2 Other Correlated Signals 


Spatial correlations in pulsar timing data have been stud- 
ied in depth in the literature (Tiburzi et al. 2016; Roebber 
2019), and their consideration is an important part of any 
GW detection procedure. While GWs induce a quadrupole- 
dominated set of correlations there are other types of spatial 
correlations between pulsar data sets (Roebber & Holder 
2017; Tiburzi et al. 2016; Roebber 2019). Monopolar spatial 
correlations, i.e., all pulsars seeing the same shifts in resid- 
uals irrespective of sky position, can manifest from clock 
errors, either in the BIPM clock standards, or the various 
observatory clocks used across the world (Hobbs et al. 2020). 
Dipolar spatial correlations can manifest from the error in 
measurement of processes where the motion of the Earth in 
the solar system is important (Caballero et al. 2018). This is 
most direct in the modeling of the solar system barycenter 
frame of reference, into which all pulsar TOAs are trans- 
formed. Errors in solar wind modeling can also add dipo- 
lar correlations (Tiburzi et al. 2016). While monopole and 
dipole correlations are theoretically orthogonal to HD corre- 
lations in the space of overlap reduction functions, real data 
with noise can result in some of these modes mixing (Roeb- 
ber 2019). This mixing could erroneously be detected as a 
GWB. 

The polarization content of the GWB could also devi- 
ate from the two tensor transverse (TT) modes predicted 
by general relativity which lead to the HD spatial correla- 
tions (Arzoumanian et al. 2021b). Deviations from general 
relativity would result in a correlation pattern that differs 
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from HD. We would like to emphasize that the current data 
set does not allow us to draw any conclusions on the pres- 
ence of spatial correlations. As can be seen in Figure 6 the 
uncertainties on the spatial correlation coefficients Ta»(Ẹ) 
determined by the optimal statistic analysis are large. For 
most angular bins the correlation is indistinguishable from 
zero, corresponding to the uncorrelated CP. Close to sub- 
mission of this paper, we noticed a preprint by Chen et al. 
(2021a). They have analyzed the IPTA DR2 searching for 
alternative GW polarizations and claim evidence for spatial 
correlations induced by a scalar transverse (ST) polariza- 
tion mode. Chen et al. (2021a) also report a Bayes factor in 
favor of HD correlations (TT modes) compared to an uncor- 
related CP about six times larger than we find: 12 against 
our 2. Even though we have not searched for the ST mode, 
we would like to highlight that the reported high evidence 
of spatial correlations in Chen et al. (2021a) is contrary to 
what we conclude using the same data set. The scalar trans- 
verse ORF is positive definite and should be accompanied 
by positive evidence for a monopolar correlation in analy- 
ses using only the cross-terms (Arzoumanian et al. 2021b). 
This is the case in our optimal statistic analysis where the 
monopolar correlations have the largest S/N and smallest 
false alarm p-value. In this sense finding some evidence in fa- 
vor of ST correlations is not too surprising, however, we find 
no conclusive evidence in favor of any correlation pattern. 
Using more information in our analysis with both auto- and 
cross-terms disfavors monopolar correlations compared to an 
uncorrelated CP. Therefore, the conclusions of Chen et al. 
(2021a) need to be taken with caution. Finally, Chen et al. 
(2021a) find a Bayes factor in favor of the common process 
to be several orders of magnitude smaller (log,; BF = 4.6 
compared to 8.2) than we do. They use a different pulsar 
noise model, including an additional sinusoidal annual DM 
variation in all pulsars, which could account for some, but 
likely not all, of the differences. 


5.3 Astrophysical Interpretation 


The nHz GWB is generally thought to be dominated by 
GW emission from SMBHBs (Burke-Spolaor et al. 2019), 
with the most massive local SMBHBs expected to be indi- 
vidually observable in the next 5-10 years (Mingarelli et al. 
2017). Given that these systems are just a local subset of 
the cosmological SMBHB population producing the GWB, 
their local number density, $gnup,o, should correlate with 
the amplitude of the GWB. The GWB amplitude induced 
by a cosmological population of circularized SMBHBs can 
be expressed in geometric units (where G = c = 1) as (e.g. 
Phinney 2001; Sesana et al. 2008; Sesana 2013b) 


4 M93 — qdPógug 
Awe = —— dMıdzd 
GWB 3n1/8 III 10% KETE dMidzdq' (7) 


where M; is the mass of the primary SMBH, M» the mass 
of the secondary, q = M2/Mı < 1 is the mass ratio, 
M5 = [q(1 + 9) ?] M97 is the SMBHB chirp mass with 
total binary mass M = Mı + Mo, and d? Opup /(dMidzdq) 
is the differential comoving number density of SMBHBs per 
unit Mi, z, and q. 

'To determine the local number density of SMBHBs im- 
plied by a Acwp & 2.8 x 10-15 background we use the 
quasar-based SMBHB model from Casey-Clyde et al. (2021), 
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Figure 11. The GWB characteristic strain as a function of the local SMBHB number density, ®gyp 0, and the minimum primary BH 
mass, MBH,1 min, and maximum redshift, zmax, of the population contributing Z 95% of the GWB signal. Left: Three representative slices 
of the strain in this parameter space (one along each axis), with solid contours showing their intersection with isosurfaces of constant 
strain value (Acp shown in bold). Right: 3D visualization of the zmax — Mpu,i,min panel from the left and its intersection with an 


Aawp = 2.8 x 1071? isosurface (gray). 
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Figure 12. Comparison of power law constraints versus theo- 
retical SMBHB populations. The 2D amplitude, spectral index 
constraints of the CPs from Figure 9 are compared to the re- 
gion of parameters recovered from a large number of realizations 
of SMBHB population simulations using astrophysical relations 
from (M21 Pop, Middleton et al. 2021) shown in grey contours 
and this work (IPTA DR2 Pop) shown in purples contours. 


which assumes proportionality between SMBHB and quasar 
populations (which may be triggered by galaxy major merg- 
ers, Stemo et al. 2020) over mass and redshift. This has 
the effect of setting AcwsBs « 1/®sup,o, so that Acws 
directly implies @pypio. To check coverage of the entire 
signal from SMBHBs over mass and redshift, we parame- 
terize Aawp = AawB (®pup,o, Mi min, Zmax); where Mi min 
and Zmax are the minimum primary SMBH mass and maxi- 
mum redshift, respectively, in Equation 7 (Casey-Clyde et al. 
2021). We plot this parameterization of the GWB compared 
to various strain measurements in Figure 11, including an 
Aaws £ 2.8 x 107? signal (bold contour, gray isosurface). 


'The 2D panels of Figure 11 show three representative slices 
from this parameter space (one along each axis), with con- 
tours denoting their intersection with isosurfaces of constant 
GWB signal amplitude. The 3D plot shows the bottom right 
2D panel in this 3D parameter space, along with its inter- 
section with an Acwp ^ 2.8 x 10^ !? isosurface. We find 
that recovery of a background amplitude like Acp requires 
Bug, o ~ 1.5 x 1075 Mpc™? (corresponding to the bottom 
right 2D panel in Figure 11), roughly an order of magnitude 
larger than the ~ 1.6 x 107? Mpc^? number density implied 
by Mingarelli et al. (2017). 


Besides this new quasar-based method the standard ap- 
proach to determining the local number density ppp, is to 
model d? gung /(dMidzdq) using major mergers and empir- 
ically observed galaxy and black hole relations (e.g., Simon 
& Burke-Spolaor 2016; Chen et al. 2019; Middleton et al. 
2021). Following the methods from Middleton et al. (2021) 
we analyze the IPTA DR2 CP amplitude. Figure 12 com- 
pares the spread of amplitude and spectral index from the 
IPTA DR2 CP against values recovered from realizations 
of SMBHB population simulations. The original population 
constraints using the NANOGrav (Arzoumanian et al. 2020) 
frequency bins are shown by the grey shaded area. Repeating 
the analysis with the frequency coverage of the IPTA DR2 
gives the purple shaded contours. As we reach into lower fre- 
quencies the simulations become more constrained towards 
the expected spectral index y = 13/3. Limiting the SMBHB 
chirp mass M > 105? Mo in the integral of Equation 7 we 
get ®gupo & 3.0 x 107? Mpc^?, which is about a factor 
of 20 times larger than the number density from Mingarelli 
et al. (2017). 
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5.4 Outlook for other GWBs 


Although we have evidence for a single common process, 
whose amplitude and spectral index are consistent with pre- 
dicted values from a population of SMBHBs with a number 
density o ~ 10-°Mpc~?, other sources could also be plau- 
sible, for example, primordial black holes (e.g., Vaskonen & 
Veermäe 2021; De Luca et al. 2021; Kohri & Terada 2021), 
cosmic strings (e.g., Ellis & Lewicki 2021; Blasi et al. 2021; 
Blanco-Pillado et al. 2021) or phase transitions (e.g., Arzou- 
manian et al. 2021a, and references therein). These sources 
could produce a GWB consistent with the CP contours from 
P'TA data. 

Pol et al. (2021) have shown that the initial confident 
detection of a GWB including HD correlation will place very 
stringent constraints on the properties of the possible source 
of GWBs. It is also possible to have several backgrounds 
affecting the data, splitting the total common power into 
several components. A detailed study into how well we can 
separate multi-component GWBs is underway. 


5.5 Modern Noise Mitigation 


The PTA community continues to develop new data analy- 
sis strategies towards the detection of gravitational waves in 
pulsar timing data. Aggarwal et al. (2019, 2020) showed that 
unmodeled noise features in a single pulsar could leak into 
the gravitational wave channel for deterministic continuous 
GW and GW memory signals respectively. More recently im- 
mense effort was put into the development of individualized 
noise models for PPTA pulsars, demonstrating that sensi- 
tivity can be gained from better modeling (Goncharov et al. 
2021b). Similar advanced noise modeling efforts are cur- 
rently underway in both NANOGrav (Arzoumanian et al., in 
prep.) and EPTA (Chalumeau et al., in prep.). More sophis- 
ticated noise modeling is important, because many types of 
noise can add steep spectral index, low frequency power to 
pulsar data sets, complicating GWB recovery. For example, 
noise from fluctuations due to the interstellar medium or 
time-correlated noise from long-term instrumental effects in 
telescope systems, which could arise from a combination of 
polarisation miscalibration and secular changes in receiver 
gain. 


6 CONCLUSION 


This work shows the immense potential of combining the 
global efforts of PTA collaborations into one data set. Fig- 
ure 1 and Figure 8 show that IPTA DR2 is significantly more 
sensitive than any of the constituent data sets from which 
it is constructed. While the data in DR2 have now been su- 
perseded by more up-to-date efforts, (Alam et al. 2021; Kerr 
et al. 2020; Chen et al. 2021b), Figure 9 shows that the sen- 
sitivity from combining these older data sets is comparable 
with these newer single PTA data releases. 

The conclusions of this analysis are broadly similar to 
the various GWB analyses carried out by the NANOGrav, 
the PPTA and EPTA (Arzoumanian et al. 2020; Goncharov 
et al. 2021a; Chen et al. 2021b). All of these data sets favor 
the CP model over one with only intrinsic red noise in the in- 
dividual pulsars. None of these data sets shows clear support 
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for the spatial correlations indicative of a GWB. Therefore 
a detection of a GWB can not be claimed. The strong de- 
tection of red noise that broadly matches the spectral char- 
acteristic of a GWB from SMBHBs before there is support 
for spatial correlations is expected from our understanding 
of the change in sensitivity of PTAs (Romano et al. 2021; 
Siemens et al. 2013). As was shown in (Pol et al. 2021), if 
the power in the auto-correlations of these pulsars is the first 
sign of the GWB then evidence for the spatial correlations 
should follow in upcoming data sets. If the next individual 
P'TA data sets show increased support, but are short of de- 
tection thresholds, then combining them into an IPTA data 
set could immediately result in a data set with a significant 
detection. Such a combination will have the longest times- 
pan, largest number of pulsars and independent observing 
systems, and will thus enable a robust GWB search. 
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